Discovering, Learning and Exploiting Relevance

Authors

  • Cem Tekin
  • Mihaela van der Schaar
Abstract

In this paper we consider the problem of learning online which information to consider when making sequential decisions. We formalize this as a contextual multi-armed bandit problem in which a high-dimensional (D-dimensional) context vector arrives at a learner, which must select an action at each time step to maximize its expected reward. Each dimension of the context vector is called a type. We assume that there exists an unknown relation between actions and types, called the relevance relation, such that the reward of an action depends only on the contexts of the relevant types. When the relation is a function, i.e., the reward of an action depends only on the context of a single type, and the expected reward of an action is Lipschitz continuous in the context of its relevant type, we propose an algorithm that achieves Õ(T^γ) regret with high probability, where γ = 2/(1 + √2). Our algorithm achieves this by learning the unknown relevance relation, whereas prior contextual bandit algorithms that do not exploit the existence of a relevance relation have Õ(T^((D+1)/(D+2))) regret. Our algorithm alternates between exploring and exploiting; it does not require reward observations in exploitations, and it guarantees with high probability that actions with suboptimality greater than ε are never selected in exploitations. The proposed method can be applied to a variety of learning applications, including medical diagnosis, recommender systems, popularity prediction from social networks, and network security, where at each time instant vast amounts of different types of information are available to the decision maker, but the effect of an action depends on only a single type.
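The setting described in the abstract can be illustrated with a small toy model. The following is only a sketch under stated assumptions, not the authors' algorithm: the action names, the relevance map `RELEVANCE`, the reward curves in `MEAN_REWARD`, and the helper functions are all hypothetical and chosen for illustration. It shows a relevance relation that is a function (each action has exactly one relevant type) and evaluates the exponent γ = 2/(1 + √2), which is roughly 0.83.

```python
import math
import random

# Hypothetical toy setup: D context types, each taking a value in [0, 1].
D = 5

# Assumed relevance function: each action maps to the index of its
# single relevant type (these names and indices are illustrative only).
RELEVANCE = {"a0": 0, "a1": 3}

# Assumed mean-reward curves, Lipschitz on [0, 1] with constant at most 1
# and with values that stay inside [0, 1].
MEAN_REWARD = {
    "a0": lambda x: 1.0 - abs(x - 0.3),
    "a1": lambda x: 0.5 + 0.5 * x,
}


def expected_reward(action, context):
    """Expected reward depends only on the context of the relevant type."""
    return MEAN_REWARD[action](context[RELEVANCE[action]])


def sample_reward(action, context, rng):
    """Draw a Bernoulli reward whose mean is expected_reward(action, context)."""
    return 1 if rng.random() < expected_reward(action, context) else 0


def regret_exponent():
    """Evaluate gamma = 2 / (1 + sqrt(2)) from the abstract."""
    return 2.0 / (1.0 + math.sqrt(2.0))


if __name__ == "__main__":
    rng = random.Random(0)
    # Two contexts that agree on types 0 and 3 but differ elsewhere.
    c1 = [0.2, 0.9, 0.1, 0.7, 0.4]
    c2 = [0.2, 0.0, 0.5, 0.7, 0.8]
    for action in RELEVANCE:
        print(action, expected_reward(action, c1), expected_reward(action, c2))
    print("sampled reward:", sample_reward("a0", c1, rng))
    print("gamma:", regret_exponent())
```

In this toy model, two context vectors that agree on an action's relevant type yield the same expected reward for that action, no matter how the other D − 1 types differ. A learner that knew the relevance function could therefore estimate each action's reward curve over a one-dimensional context instead of the full D-dimensional one.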


Related resources

Discovering Action-Dependent Relevance: Learning from Logged Data

In many learning problems, the decision maker is provided with various (types of) context information that she might utilize to select actions in order to maximize performance/rewards. But not all information is equally relevant: some context information may be more relevant to the decision problem at hand. Discovering and exploiting the most relevant context information speeds up learning, red...


Rectifying Self Organizing Maps for Automatic Concept Learning from Web Images

We attack the problem of learning concepts automatically from noisy web image search results. Going beyond low level attributes, such as colour and texture, we explore weakly-labelled datasets for the learning of higher level concepts, such as scene categories. The idea is based on discovering common characteristics shared among subsets of images by posing a method that is able to organise the ...


ConceptMap: Mining Noisy Web Data for Concept Learning

We attack the problem of learning concepts automatically from noisy Web image search results. The idea is based on discovering common characteristics shared among subsets of images by posing a method that is able to organise the data while eliminating irrelevant instances. We propose a novel clustering and outlier detection method, namely Concept Map (CMAP). Given an image collection returned f...


Discovering Missing Values in Semi-Structured Databases

We explore the problem of discovering multiple missing values in a semi-structured database. For this task, we formally develop Structured Relevance Model (SRM) built on one hypothetical generative model for semi-structured records. SRM is based on the idea that plausible values for a given field could be inferred from the context provided by the other fields in the record. Small-scale experime...


Learning Uncertain Rules with CONDORCKD

CONDORCKD is a system implementing a novel approach to discovering knowledge from data. It addresses the issue of relevance of the learned rules by algebraic means and explicitly supports the subsequent processing by probabilistic reasoning. After briefly summarizing the key ideas underlying CONDORCKD, the purpose of this paper is to present a walk-through and system demonstration.




Publication date: 2014